In this project, I want to tell a data story about happy songs. Music accompanies us every day. It lifts us up when we are down, and sometimes moves us to tears. I want to learn more about happy songs from the perspective of their lyrics.

1. What is the Happiest Genre?

To find the happiest genre, I started with a sentiment analysis of the given stemmed lyrics. Since each song contains a mixture of different sentiments, let me define a happy song first. In my view, a song is happy when joy takes up a larger share of its sentiments than any other emotion. In this project, the sentiment of each stemmed keyword is taken from the NRC Lexicon.
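The definition above can be sketched in a few lines. This is only an illustration: `TOY_LEXICON`, `major_emotion`, and `is_happy` are hypothetical names, the word-to-emotion entries are made up, and the real NRC lexicon is much larger.

```python
from collections import Counter

# Hypothetical mini-lexicon mapping stemmed words to emotions (assumption:
# the real NRC lexicon is far larger and these entries are invented).
TOY_LEXICON = {
    "love": ["joy"],
    "smile": ["joy"],
    "sunshin": ["joy"],
    "cri": ["sadness"],
    "lone": ["sadness"],
    "fight": ["anger", "fear"],
}


def major_emotion(stemmed_lyrics, lexicon=TOY_LEXICON):
    """Return the emotion matched by the most words, or None if nothing matches."""
    counts = Counter()
    for word in stemmed_lyrics.split():
        # A word may carry several emotions; each one gets a count.
        counts.update(lexicon.get(word, []))
    if not counts:
        return None
    # On a tie, the emotion that was seen first wins.
    return counts.most_common(1)[0][0]


def is_happy(stemmed_lyrics):
    """A song counts as happy when joy is its major emotion."""
    return major_emotion(stemmed_lyrics) == "joy"
```

How ties are broken is a design choice; here the emotion encountered first wins, which may differ from what the project does.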

| song | genre | major.emotion |
| --- | --- | --- |
| when-you-were-with-me | Hip-Hop | joy |
| careless-whisper | Hip-Hop | sadness |
| 2-59 | Hip-Hop | anticipation |
| power-of-desire | Hip-Hop | fear |
| you-re-not-in-love | Hip-Hop | disgust |

Then I made a bar plot, as above, to see how the songs of each genre are distributed among the eight categories of emotions. From the plot, I noticed that joy is the most common major emotion in nearly all genres, the exceptions being Hip-Hop and Metal. Notably, Jazz is the happiest genre in the given corpus: 43.3% of Jazz songs have joy as their major emotion.
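The per-genre shares behind such a plot can be computed roughly as follows. This is a sketch with made-up rows; `rows`, `emotion_shares`, and `happiest` are hypothetical names and the numbers do not come from the project's data.

```python
from collections import Counter, defaultdict

# Made-up (song, genre, major emotion) rows, for illustration only.
rows = [
    ("song-a", "Jazz", "joy"),
    ("song-b", "Jazz", "joy"),
    ("song-c", "Jazz", "sadness"),
    ("song-d", "Metal", "fear"),
    ("song-e", "Metal", "joy"),
    ("song-f", "Metal", "fear"),
    ("song-g", "Metal", "anger"),
]


def emotion_shares(rows):
    """For each genre, return the fraction of songs per major emotion."""
    counts = defaultdict(Counter)
    for _song, genre, emotion in rows:
        counts[genre][emotion] += 1
    return {
        genre: {e: n / sum(c.values()) for e, n in c.items()}
        for genre, c in counts.items()
    }


shares = emotion_shares(rows)
# The "happiest" genre is the one with the largest share of joy songs.
happiest = max(shares, key=lambda g: shares[g].get("joy", 0.0))
```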

Next, I want to dig deeper and compare the top two happiest genres, Jazz and R&B, according to the plot above.

2. Focusing on the top 2 happiest genres and their “joy” songs:

- Are they similar or different?

2.1 Overview of the most frequent words

To get an overall understanding of the “joy” songs in the Jazz and R&B genres, I first looked at their most frequently used words through wordclouds.
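A wordcloud is driven by word counts, which can be sketched as below. The lyrics are made up, and `joy_lyrics` and `top_words` are hypothetical names used only for this illustration.

```python
from collections import Counter

# Made-up stemmed joy lyrics per genre, for illustration only.
joy_lyrics = {
    "Jazz": ["love heart moon love", "babi love danc"],
    "R&B": ["babi babi love heart", "love girl babi"],
}


def top_words(songs, n=3):
    """Return the n most frequent words across a list of songs."""
    counts = Counter(word for song in songs for word in song.split())
    return counts.most_common(n)


# Counts for each genre on its own, then for both genres together.
per_genre = {genre: top_words(songs) for genre, songs in joy_lyrics.items()}
combined = top_words([s for songs in joy_lyrics.values() for s in songs])
```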

For all genres with the “joy” emotion:

For Jazz:

For R&B:

From the wordcloud plots, I found that Jazz and R&B use similar stemmed keywords to express joy. Whether we look at one genre alone or at the genres together, love, baby, heart, etc. all appear among the most frequent words in every wordcloud above. This finding motivated me to think about the most frequent topics within the joy category. Do these joy songs share not only the same emotion, but also similar topics? Are people singing about the same joyful topics across different genres?

With this question in mind, I turned to topic modelling to answer it.

2.2 Topic modelling

Before doing topic modelling, I converted the stemmed words from a data frame back to a Corpus to generate the document-term matrix. Then I used LDA (Latent Dirichlet Allocation) to assign topics to these joy Jazz and R&B songs. The number of topics was chosen by trial and error, so that meaningful topics could be inferred from the grouped keywords. In this case, I settled on seven topics.
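To make the two steps concrete, here is a minimal sketch that builds a document-term matrix and fits LDA with a small collapsed Gibbs sampler. It is not the library routine used in the project: the lyrics are made up, the names `document_term_matrix` and `lda_gibbs` and the values of `alpha`, `beta`, and `iters` are assumptions, and the toy run uses two topics rather than seven.

```python
import random
from collections import Counter

# Made-up stemmed lyrics, for illustration only.
docs = [
    "love babi heart love kiss",
    "babi dear woman love babe",
    "danc night music swing danc",
    "music night danc heart swing",
]


def document_term_matrix(docs):
    """One row per song, one column per vocabulary word, cells are counts."""
    tokens = [d.split() for d in docs]
    vocab = sorted({w for doc in tokens for w in doc})
    col = {w: j for j, w in enumerate(vocab)}
    dtm = [[0] * len(vocab) for _ in tokens]
    for i, doc in enumerate(tokens):
        for w, n in Counter(doc).items():
            dtm[i][col[w]] = n
    return vocab, dtm


def lda_gibbs(dtm, k, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns doc-topic and topic-word counts."""
    rng = random.Random(seed)
    n_words = len(dtm[0])
    # LDA is a bag-of-words model, so the counts can be expanded to word ids.
    word_ids = [[j for j, n in enumerate(row) for _ in range(n)] for row in dtm]
    doc_topic = [[0] * k for _ in dtm]
    topic_word = [[0] * n_words for _ in range(k)]
    topic_total = [0] * k
    # Start from a random topic assignment for every word occurrence.
    assign = [[rng.randrange(k) for _ in doc] for doc in word_ids]
    for d, doc in enumerate(word_ids):
        for j, z in zip(doc, assign[d]):
            doc_topic[d][z] += 1
            topic_word[z][j] += 1
            topic_total[z] += 1
    for _ in range(iters):
        for d, doc in enumerate(word_ids):
            for i, j in enumerate(doc):
                # Remove the current assignment, then resample it.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][j] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][j] + beta)
                    / (topic_total[t] + beta * n_words)
                    for t in range(k)
                ]
                z = rng.choices(range(k), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][j] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


vocab, dtm = document_term_matrix(docs)
doc_topic, topic_word = lda_gibbs(dtm, k=2)
# The three highest-count terms of each topic, used to label it by hand.
top_terms = [
    [vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:3]]
    for row in topic_word
]
```

Rerunning with different values of `k` and reading `top_terms` mirrors the trial-and-error choice of the topic number described above.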

Using the plot above, I can see how Jazz and R&B songs are distributed among the summarized topics within the joy category. The most obvious difference between the two topic distributions is in Topic 2. Topics can be hashtagged manually from their most popular and salient terms; for Topic 2 these include woman, babe, and dear, so I hashtagged it as “Love”.

For the other topics, Jazz and R&B have roughly the same proportion of songs. In summary, the most noticeable difference is in the Love topic: R&B has a larger share of love songs than Jazz, while the two genres have roughly the same percentage of happy songs on the other topics. This high similarity in topics is consistent with the high similarity in frequent keywords shown in the wordclouds.

2.3 Lyrics Manifestation

Besides the keywords and topics of a song, the pattern of its lyrics also plays an important role in expressing happiness. In this section, I focus on the repetitiveness of lyrics and analyse the difference between Jazz and R&B joy songs from this perspective.

Why is the repetitiveness of lyrics important?
Repetition is very important in music, where melody and lyrics are often repeated. However, the level of repetitiveness varies a lot between songs. For example, a mostly narrative song is less repetitive than a non-narrative one. High repetitiveness sacrifices some storytelling ability; on the other hand, repetitive lyrics assist memory and help drive market success.

So how repetitive are joy Jazz and R&B lyrics?

For happy songs, R&B and Jazz have broadly similar lyrics repetition ratios. On average, R&B songs have slightly over 47.2% repetition, and Jazz songs have 40.4%. R&B songs are therefore somewhat more repetitive, which makes them easier to remember and may help them become popular. This could help explain why R&B is more popular than Jazz in the current music market.
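The text does not spell out how the repetition ratio is computed. One simple possibility, used here purely as an assumption, is the share of words in a song that repeat an earlier word; `repetition_ratio` and `mean_ratio` are hypothetical names.

```python
def repetition_ratio(lyrics):
    """Share of words that repeat an earlier word: 1 - unique / total."""
    words = lyrics.lower().split()
    if not words:
        return 0.0
    return 1 - len(set(words)) / len(words)


def mean_ratio(songs):
    """Average repetition ratio over a list of songs (0.0 for an empty list)."""
    if not songs:
        return 0.0
    return sum(repetition_ratio(s) for s in songs) / len(songs)
```

Applying `mean_ratio` to the joy songs of each genre would give one average per genre to compare. Other definitions, such as the share of repeated lines, would give different numbers.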


Future Work:

Manually Hashtagged Topics in LDA:
Some grouped keywords are still hard to map to a reasonable topic with full confidence. Next time I will try another topic modelling technique as a reference and see whether it offers any insights or techniques that can be borrowed.

For this project, luckily, the topic of interest (Love) can be easily summarized from its salient and most frequent keywords.

3. Conclusion

  1. The happiest genre is Jazz. In addition, joy is the most common major emotion in all genres except Metal and Hip-Hop.

  2. Comparing happy songs between Jazz and R&B:
     1. For happy songs, R&B lyrics are more repetitive than Jazz lyrics, which makes the songs easier to remember and may help their popularity. This could be one of the reasons that R&B is more popular than Jazz.